1.0  INTRODUCTION


Deadlock detection is an important and interesting problem. It is
important to detect and break up deadlock situations in the interest
of system throughput. The detection problem is interesting in a
graph-theoretic sense, and also because distributed and parallel
processing techniques are applicable.


A deadlock situation is the possible result of competition for
resources, for example data base transactions requesting exclusive
access to files. Deadlocks can be prevented when resource requests
are always granted with system-wide atomicity, but in a distributed
system such a guarantee is not practical: there are time delays in
the interaction of sites of a distributed system.


1.1  Organization 


This paper is divided into six parts. Part 1 is a review of
prevalent terminology and deadlock detection algorithms, Part 2
defines the model used for this paper, Part 3 presents the deadlock
detection procedure, Part 4 discusses some implementation issues,
and Part 5 is the concluding section. Part 6 is an appendix devoted
to the correctness proof of the detection procedure.



The  remainder  of  Part  1  defines  and  classifies  the 
subject  of  deadlock  detection.  Section  1.2  is  a 
review  of  the  terminology  for  deadlock  detection 
in  the  distributed  data  base  environment.  Section 
1.3  contains  a  classification  of  deadlock  models. 
Section  1.4  completes  the  introduction  with  a 
review  of  some  related  solutions  to  deadlock 
detection. 


1.2  Transaction  Model 

The paradigm used in this paper for discussion of deadlock detection
is due to Menasce and Muntz [2]. Briefly, data base transactions
present resource requests to controllers. A resource request may be
a request to lock files or may have more abstract meanings. A
transaction is blocked from the time it presents a request to a
controller until the controller grants the request and the
transaction becomes active. A resource request can be local, or
refer to a resource at another site, in which case the transaction
is distributed. A distributed transaction is implemented by
transaction agents, each of which is the local agent for a given
transaction at one site. Inter-site resource requests are always
between two agents of the same transaction.

A transaction wait-for graph (TWFG) is a model of resource requests.
The vertices of the graph are associated with transaction agents.
Directed edges in the graph represent "wait-for" relationships
between transaction agents. A vertex with outgoing edge(s) is a
blocked transaction agent. A cycle in the graph indicates deadlock,
provided that a cycle is defined carefully, as we do in Section 2.4.


1.3  Request Models


Figure 1 is a TWFG representing a deadlock situation. This example
has four transactions, T1-T4, implemented by seven transaction
agents. The directed edge from vertex T1_1 to vertex T2_1 shows that
agent T1_1 is blocked. Vertex T2_2 has no outgoing edges, and
therefore is an active (non-blocked) transaction. Vertex T1_1 has
two outgoing edges, and this indicates that transaction T1 has two
outstanding resource requests, and both must be satisfied before
transaction T1 becomes active. This type of resource request is
called an AND-request, because transaction T1 must have resource 1
and resource 2.

Figure 1.  Example of TWFG.


The AND-model is a term used to signify that all resource requests
are AND-requests. The AND-model has been the traditional view of
resource requests in distributed data base systems.

An alternative model of resource requests is the OR-model. A request
for numerous resources is satisfied by the granting of any one
requested resource. For example, assume that a transaction waits for
resource 1 or resource 2. Suppose resource 1 is granted to this
request; then the transaction becomes active, and the request for
resource 2 is cancelled.

In the OR-model, a cycle is insufficient for deadlock detection. To
see this, suppose all requests in Figure 1 are OR-requests; then
transaction T1 is not deadlocked because T2_2 has no outgoing edges.
In terms of the TWFG, a knot will indicate deadlock. A knot is a
structure defined as follows: vertex X is in a knot iff a path from
vertex X to vertex Y implies that there is also a path from Y to X.
In other words, in a knot there are no "dead-ends" in the graph.
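
The knot condition can be checked directly when a complete graph is
at hand. The following is a minimal sketch in Python, not part of the
procedure of this paper: the dictionary encoding of the graph and the
function names are assumptions made for illustration, and the sketch
additionally assumes that a vertex with no outgoing edges (an active
agent) is never counted as being in a knot.

```python
from collections import deque

def reachable(graph, start):
    """Return the set of vertices reachable from start by one or more edges."""
    seen = set()
    queue = deque(graph.get(start, ()))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(graph.get(node, ()))
    return seen

def in_knot(graph, x):
    """True if x is blocked and every vertex reachable from x can reach x."""
    targets = reachable(graph, x)
    # Assumption: a vertex without outgoing edges is active, hence not in a knot.
    return bool(targets) and all(x in reachable(graph, y) for y in targets)

# A cycle a -> b -> a with a "dead-end" exit b -> c, and the same cycle closed.
cycle_with_exit = {"a": ["b"], "b": ["a", "c"], "c": []}
closed_cycle = {"a": ["b"], "b": ["a"]}
```

With these two graphs, vertex a lies on a cycle in both, but it is in
a knot only in the second, which is why a cycle alone does not
indicate deadlock in the OR-model.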

The AND/OR model, also called the communication model, is a
generalization of the two previous models. An AND/OR-request may
specify any combination of "and" and "or" in the resource request.
For example, a request for "(S and (T or U)) or V" is possible, and
S, T, U and V may exist at different sites in the system.

1.4  Related  Work 

The solution to deadlock detection in a non-distributed system is
well studied [10]. It is possible to directly implement this
solution in a distributed system if a central scheduler is used, but
a central scheduler is not practical. The centralized strategy could
theoretically be implemented by broadcasting resource events
(requests, grants) with time-stamps as in Lamport [9]. All
controllers would have updated global views and deadlock could be
detected.

A variation of the time-stamp strategy is proposed in [1]: one
controller is designated as the deadlock detector, and all other
controllers send pieces of the TWFG, which are time-stamped, to the
designated central controller. A consistent, global TWFG is
assembled to detect deadlock. The drawbacks associated with a
central controller are ameliorated by structuring the controllers
into a hierarchy, which reduces the message traffic, as in [2]. To
some extent the hierarchy-of-controllers achieves distribution of
processing, although the centralized strategy remains.

A completely distributed algorithm to detect deadlock by
construction of a TWFG is non-trivial, as numerous counter-examples
show [1,3,5]. The counter-examples typically show that, due to time
delays in the transfer of information between controllers, an
incorrect TWFG may be constructed. In this way, deadlock may go
undetected or be falsely detected.

The deadlock detection procedure in this paper does not follow the
strategy of assembling and manipulating graph structures. Instead,
the underlying graph structure is incorporated in the way the
program is distributed. Two previous papers employed this theme. A
deadlock detection procedure specific to the AND-model appears in
[6]. Unlike earlier solutions, the procedure does not construct a
TWFG, even in reduced form. In a related work [7], an AND/OR model
deadlock detection procedure is developed. This AND/OR deadlock
detection procedure is distributed, message-based, and does not
construct a TWFG. The procedure efficiently detects deadlock in the
OR-model, but is less efficient in an AND/OR-model: deadlock in the
AND/OR-model is detected by repeated application of a test for
OR-model deadlock, which will eventually result in detection of
deadlock for the AND/OR-model.

A non-distributed solution to the AND/OR model, which contains
several interesting examples of AND/OR resource requests, is given
by Beeri and Obermarck [4]. The expressive power of the AND/OR model
is greater because "non-specific" requests are permitted. For
instance, a request for any M available resources, from a pool of
size N, can be represented by an AND/OR request. In a multiple-copy
distributed data base, a transaction may request the locking of any
available copy, as the example in the abstract of this paper
suggests. Our contribution is to present a distributed solution to
the problem defined by Beeri and Obermarck.
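
To make the "any M of N" remark concrete, the following Python sketch
expands such a request into an OR over AND-groups. It is only an
illustration under stated assumptions: the function names and the
resource names r1, r2, r3 are hypothetical, and the expansion chosen
here (one AND-group per M-element subset of the pool) is one possible
encoding, not necessarily the one used in [4].

```python
from itertools import combinations

def m_of_n_request(pool, m):
    """Expand 'any m resources from pool' into a list of AND-groups.

    The returned list is read as an OR: the request is met as soon as
    every resource of some group has been granted.
    """
    return [frozenset(group) for group in combinations(pool, m)]

def satisfied(request, granted):
    """True if some AND-group of the request is fully contained in granted."""
    granted = set(granted)
    return any(group <= granted for group in request)

# Any 2 resources out of a pool of 3 (hypothetical resource names).
request = m_of_n_request(["r1", "r2", "r3"], 2)
```

The number of AND-groups grows as the binomial coefficient of N and
M, so this encoding is practical only for small pools.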

2.0  PROCESS MODEL


This section contains definitions that enhance the basic model
presented in Section 1.2. The revised model definition is used to
accurately define deadlock and support the discussion of the
procedure in Part 3. Sections 2.1 and 2.2 define processes and
messages, which are the basic language for the deadlock detection
procedure. Section 2.3 describes the wait-for graph for processes,
and assigns color attributes to the edges. Section 2.4 gives a
definition of deadlock in terms of colored edges. Section 2.5
details the relationship between transactions and processes, and
Section 2.6 is a list of behavioral axioms for processes.

2.1  Processes

In this paper, the deadlock detection procedure does not directly
refer to transactions or agents, but instead to processes. We use
the term "process" to extend the idea of transaction agents. Like
transaction agents, processes represent transactions. A process can
be thought of as a logical entity, manipulated by the controller,
much as an operating system manipulates internal tables in the
service of jobs. The reason for the distinction is that the mapping
from processes to transaction agents is many-to-one, not one-to-one,
i.e., the mapping may require that a transaction be represented by
numerous processes at a single site, as we will illustrate in
Section 2.5.

To  simplify  the  following  discussion,  the  language 
is  "a  process  does  some  action"  or  "a  process 
waits".  In  fact,  it  should  be  understood  that  the 
controller  is  performing  actions  on  behalf  of 
transactions,  or  to  detect  deadlock.  "A  process 
sends  a  message  to  another  process"  means  that  a 
controller  sends  a  message  to  another  controller, 
only  if  the  processes  are  at  different  sites. 
Sending  messages  between  processes  at  a  single 
site  is  an  internal  operation  for  a  controller. 

2.2  Messages


Resource requests and the grants of resources are modeled by
messages. To request a resource, a process sends a request (message)
to the process that holds the resource. After sending the request
the process waits until a grant (message) is received from the
relinquishing process. An important assumption about messages is the
order of arrival. We assume that all messages sent from one process
to another will definitely arrive, and in the order sent (FIFO,
loss-free channels between processes). No assumption about the rate
of message traffic is made. Other messages will be defined in Part 3
as part of the detection procedure.

2.3  Process  Wait-For  Graph 

Although a wait-for graph is not constructed by the deadlock
detection procedure, it will simplify the discussion to refer to
this framework. The Process Wait-For Graph (PWFG) is a theoretical
construct which reflects the true, instantaneous state of the
distributed system. Operations on the PWFG are always theoretical
operations. Each vertex in the PWFG represents a process. Assume
that each process has a globally unique name, called the process-ID,
which is used as the vertex identifier in the PWFG. A directed edge
in the PWFG indicates that a process is waiting for a grant message.
Edges have colors, attributes that define the state of an edge.

Given processes v and w, edge (v,w) is:

gray    Edge (v,w) is gray from the time v sends w a request
        message until w receives the request.

black   Edge (v,w) is black from the time w has received v's
        request until w sends a grant to v.

white   Edge (v,w) is white from the time w sends a grant to v
        until v receives the grant.

This notation is fully developed in [6]. To summarize some
properties of edges:

•  Edges are gray when created.

•  A gray edge will turn black after a finite, arbitrary time.

•  A white edge will disappear after a finite, arbitrary time.

The normal course of coloring for an edge, over time, is the
sequence: non-existence → gray → black → white → non-existence.
Attaching colors to edges permits a precise definition of deadlock,
as the following section shows.
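
The normal life cycle of an edge can be written as a small state
machine. The Python sketch below is illustrative only: the event
names (request_sent and so on) are labels invented here for the four
message events in the color definitions, and the sketch covers only
the normal course of coloring, not the cancellation of edges that
belongs to an OR-request.

```python
from enum import Enum

class Color(Enum):
    NONE = "non-existence"
    GRAY = "gray"
    BLACK = "black"
    WHITE = "white"

# (current color, event) -> next color. Event names are hypothetical labels.
TRANSITIONS = {
    (Color.NONE, "request_sent"): Color.GRAY,        # v sends a request to w
    (Color.GRAY, "request_received"): Color.BLACK,   # w receives the request
    (Color.BLACK, "grant_sent"): Color.WHITE,        # w sends a grant to v
    (Color.WHITE, "grant_received"): Color.NONE,     # v receives the grant
}

def next_color(color, event):
    """Return the color after event, or raise if the event is not allowed."""
    try:
        return TRANSITIONS[(color, event)]
    except KeyError:
        raise ValueError(f"{event} is not valid for a {color.value} edge") from None

def life_cycle():
    """Walk one edge (v,w) through its normal course of coloring."""
    color = Color.NONE
    history = [color]
    for event in ("request_sent", "request_received",
                  "grant_sent", "grant_received"):
        color = next_color(color, event)
        history.append(color)
    return history
```

Only the black state has no time bound in the properties listed
above, which is what allows black edges to persist in a deadlock.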

2.4  Deadlock Situation


In the AND-model, a deadlock is identified as a cycle of black edges
in the PWFG. More precisely, any combination of gray and black edges
in a cycle is deadlock because gray edges turn black [6]. In the
OR-model, a deadlock is a black knot. There does not appear to be a
familiar term to categorize deadlock in the AND/OR-model, but the
idea is similar: a static, black component arises in the PWFG, which
will persist until broken externally.

2.5  Mapping: Transaction → Process

To simplify the presentation in this paper, the PWFG is restricted
as follows: a process may have an OR-request or an AND-request. This
limitation does not restrict the power of the AND/OR model for the
following reason. Although a transaction may submit any type of
AND/OR-request, the AND/OR-request can be represented as a network
of processes restricted to AND-requests and OR-requests. The mapping
is a representation of the AND/OR-request in a regular form, such as
disjunctive normal form. Figure 3 shows a TWFG and the corresponding
PWFG as an example of this mapping.

Note: The dashed line between edges is used to indicate an
AND-request. In the TWFG, transaction A waits on B or (C and D).

Figure 3.  TWFG and PWFG Mapping.
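
One way to obtain such a regular form is sketched below in Python.
This is an illustration under assumptions, not the construction used
for Figure 3: requests are encoded as nested tuples ("and", a, b) or
("or", a, b) with resource names as strings, and the result is a list
of AND-groups that is read as a single OR. A top-level OR process
with one AND process per group would then realize the request.

```python
def to_dnf(expr):
    """Return expr in disjunctive normal form as a list of AND-groups.

    expr is a resource name (str) or a tuple (op, left, right) with
    op equal to "and" or "or". Each group is a frozenset of names.
    """
    if isinstance(expr, str):
        return [frozenset([expr])]
    op, left, right = expr
    left_groups, right_groups = to_dnf(left), to_dnf(right)
    if op == "or":
        # Either side alone satisfies the request.
        return left_groups + right_groups
    if op == "and":
        # Every pairing of a left group with a right group must be held.
        return [a | b for a in left_groups for b in right_groups]
    raise ValueError(f"unknown operator: {op}")

# Transaction A of Figure 3 waits on B or (C and D).
figure3 = to_dnf(("or", "B", ("and", "C", "D")))

# The request "(S and (T or U)) or V" of Section 1.3.
section13 = to_dnf(("or", ("and", "S", ("or", "T", "U")), "V"))
```

For the second request the sketch yields the groups {S,T}, {S,U} and
{V}; distributing "and" over "or" can duplicate resources across
groups, so the normal form may be larger than the original request.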


2.6  Process Behavior


This section illustrates the effect of process behavior on the PWFG.
An active process is a vertex with no outgoing edges in the PWFG. A
blocked process is a vertex with outgoing edges. A blocked process
may send neither grant nor request messages. No process, active or
blocked, may receive more than one message at any instant. When a
blocked process receives a grant message, an edge in the PWFG
disappears and there are several possibilities for the blocked
process:

•  No outgoing edges remain and the process is active by
   definition.

•  Some outgoing edge(s) remain. There are two cases:

   -  The blocked process has an AND-request. The process remains
      blocked.

   -  The blocked process has an OR-request. All of the remaining
      outgoing edge(s) are deleted from the PWFG, and the process
      is active.

All of the vanishing edge scenarios listed above are considered to
be instantaneous state transitions for the PWFG. That is, all of an
OR-request's edges disappear simultaneously in the PWFG
(cancellation).
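
The grant rules above can be sketched as follows. The Python class
and its attribute names are hypothetical and serve only to restate
the rules: a process records the targets of its outgoing PWFG edges,
and a grant removes one edge (AND) or all of them at once (OR).

```python
class Process:
    """A PWFG vertex with either an AND-request or an OR-request."""

    def __init__(self, name, mode, waiting_on):
        self.name = name
        self.mode = mode                      # "AND" or "OR"
        self.waiting_on = set(waiting_on)     # targets of outgoing edges

    @property
    def active(self):
        # A process is active exactly when it has no outgoing edges.
        return not self.waiting_on

    def receive_grant(self, sender):
        """Apply the effect of one grant message from sender."""
        self.waiting_on.discard(sender)       # the granted edge disappears
        if self.mode == "OR":
            # Cancellation: the remaining edges vanish in the same step.
            self.waiting_on.clear()

and_process = Process("p", "AND", ["a", "b"])
or_process = Process("q", "OR", ["a", "b"])
```

After one grant from a, the AND process is still blocked on b while
the OR process is already active.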

3.0  DEADLOCK DETECTION PROCEDURE


A basic description of the deadlock detection procedure follows. The
correctness proof is found in Part 6, and some implementation
considerations are in Part 4. Section 3.1 presents notation used to
describe the procedure. An informal description of the detection
procedure appears in Section 3.2. Section 3.3 is divided into
sub-sections that specify the procedure by process behavior rules.
Section 3.4 gives an example sequence of an execution of the
detection procedure.

3.1  Preliminaries

The  deadlock  detection  procedure  is  a  distributed 
program.  The  components  of  the  program  are: 

•  Messages: grants, requests, queries and replies. Grants and
   requests are discussed in Section 2.2. Queries and replies are
   used only by the detection procedure and are discussed below.

•  Local Memory: Associated with each process are two lists, the
   outgoing query list (OQ-list) and the incoming query list
   (IQ-list). These lists will hold images of queries during the
   deadlock detection procedure. Initially, both lists are empty.

Each  query  and  reply  message  has  a  variable  length 
label.  The  label  is  a  string  of  process-IDs. 
Suppose  {s,t,u,v,w}  are  process  IDs.  A  possible 
label  is  <twsu>.  Some  operations  on  labels  are 
defined: 

Catenation:  Given two labels, A and B, then A*B denotes the
             catenation of the strings. Example: A = <uvt>,
             B = <sw>, A*B = <uvt>*<sw> = <uvtsw>.

Prefix:      Given two labels, A and C, then A is prefix or equal
             to C, written A ≤ C, iff A = C or C can be written as
             C = A*B, for some label B. Example: A = <uvt>,
             C = <uvtsw> implies A ≤ C.
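
Both label operations are simple to state in code. In the Python
sketch below a label is modeled as a tuple of process-IDs; that
encoding and the function names are assumptions made for
illustration.

```python
def catenate(a, b):
    """Return the catenation A*B of two labels."""
    return a + b

def is_prefix_or_equal(a, c):
    """True iff A = C, or C = A*B for some label B."""
    return c[:len(a)] == a

# The examples from the definitions: A = <uvt>, B = <sw>.
A = ("u", "v", "t")
B = ("s", "w")
C = catenate(A, B)
```

Here C is <uvtsw>, and A is prefix or equal to C, while B is not.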

The  notation  for  a  query  is  Q(B,s),  where  B  is  the 
label  of  query  Q  which  was  sent  by  process  s. 
Query  messages  are  sent  in  the  direction  of  edges 
in  the  PWFG.  R(C,t)  denotes  reply  R  with  label  C, 
sent  by  process  t.  Reply  messages  are  sent  in  the 
reverse  direction  of  edges  in  the  PWFG. 

3.2  Procedure  Overview 

A controller initiates the procedure, and some period of detection
activity follows (sending messages, updating lists and so on). When
and how often the procedure is initiated is the topic of Section
4.1. When the activity subsides, the procedure will either have
detected deadlock or not.

Queries are used by the procedure to search the PWFG for non-blocked
processes. Replies are returned to the senders of queries to say
"search failed." When all searches fail, then all replies are
returned and deadlock will be detected.

If grant messages or non-blocked processes are encountered in the
execution of the detection procedure, then the system is not
deadlocked, and therefore the procedure will not detect deadlock.

When a query arrives at a process, then some search is to be
extended: the IQ and OQ-lists are updated, and a new query is sent
on outgoing edges. The IQ-list is a record of the queries received
by a process. The OQ-list is a record of queries sent by a process.
Both lists are lists of "on-going" searches: a global, instantaneous
view of all IQ and OQ-lists in the distributed system would describe
the state of the detection procedure at any instant.

The propagation of queries continues until a cycle is detected. The
cycle will be detected because a process will match an arriving
query with some entry in its IQ-list. In a sense, that query has
been "seen before" at that process. At such times, reply messages
are generated and returned to the query senders.

The  queries  can  be  distinguished  by  their  labels 
for  matching  purposes,  i.e.,  to  generate  the  "seen 
before"  condition.  The  query  propagation  step 
creates  new  queries  with  new  labels  so  that  each 
label  carries  a  "propagation  history." 
Eventually,  this  history  will  become  exhaustive, 
giving  rise  to  "seen  before"  for  any  process. 

When a process receives a reply, then the IQ and OQ-lists are
updated (entries removed). The reply signals a "search failed" for
activity in the PWFG, and the process propagates the reply to other
processes. Each reply is given a label to match a corresponding
query label, that is, the "search failed" message specifies which
search failed, and which entry of the OQ-list must be deleted.

3.3  Process  Behavior 


In the following sections, there is frequently a bifurcation of
process actions, depending on whether a process has an OR-request or
an AND-request. For convenience, the action for an OR-requesting
process will be prefaced with {OR}, and for an AND-requesting
process, {AND}.

Only idle (blocked) processes take part in the deadlock detection
procedure. Active (non-blocked) processes may simply ignore all
query and reply messages, which is implicit in the statement of the
following rules.

3.3.1  Initiation  of  the  Procedure 

When a controller suspects a deadlock situation, then an artificial
process, called the initiator, is created to detect deadlock. Assume
that each time a controller creates a new initiator, it is with a
different process-ID from all previous initiator or other
process-IDs. The initiator has no edges in the PWFG.

Suppose process i is the initiator, and process w is suspected of
being deadlocked. Initiation consists of process i sending Q(<i>,i)
to process w. Process i will take no further action during deadlock
detection. Later, when R(<i>,w) is received by process i, then the
controller declares deadlock for process w.

3.3.2  On  Receiving  a  Query 


If a given query has been "seen before" at a process then a reply is
generated, an action we call reflection. Queries not "seen before"
cause new queries to be generated, an action called extension.
Suppose an arbitrary process v receives query Q(B,w). Process v then
searches its IQ-list for an entry Q(T,s) such that T ≤ B. There are
two outcomes of the search:


{ This action is called reflection. }

There is some Q(T,s) that satisfies the search. Process v takes the
following action:

    Send R(B,v) to process w;

{ This action is called extension. }

The search for Q(T,s) fails. First, Q(B,w) is added to process v's
IQ-list. Suppose edges (v,x_1), (v,x_2), ..., (v,x_k) exist in the
PWFG. Process v now takes one of the following actions:

a.  {OR}

    FOR j := 1 TO k DO
    BEGIN
        Send Q(B,v) to x_j;
        Add Q(B,v) to the OQ-list;
    END;

    Notice that Q(B,v) may be added to the OQ-list numerous times.
    This is intentional: it is important that Q(B,v) appear k times
    in the OQ-list as a result of this step.

b.  {AND}

    FOR j := 1 TO k DO
    BEGIN
        Send Q(B*<x_j>,v) to x_j;
        Add Q(B*<x_j>,v) to the OQ-list;
    END;
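
The query rule can be restated as one function. The Python sketch
below is an illustration under assumptions, not the implementation of
this paper: labels are tuples of process-IDs, IQ-list entries are
(label, sender) pairs, OQ-list entries are labels, messages are
(kind, label, sender, receiver) tuples, and all names are
hypothetical. The function is meant to be called only for a blocked
process.

```python
def is_prefix_or_equal(a, c):
    """True iff label a is a prefix of, or equal to, label c."""
    return c[:len(a)] == a

def on_query(me, mode, successors, iq_list, oq_list, label, sender):
    """Handle Q(label, sender) at blocked process me.

    Updates iq_list and oq_list in place and returns the messages to send.
    """
    if any(is_prefix_or_equal(seen, label) for seen, _ in iq_list):
        # Reflection: the query has been "seen before" at this process.
        return [("R", label, me, sender)]
    # Extension: record the query and pass a new one along every edge.
    iq_list.append((label, sender))
    out = []
    for x in successors:
        # An OR process reuses the label; an AND process appends <x>.
        new_label = label if mode == "OR" else label + (x,)
        out.append(("Q", new_label, me, x))
        oq_list.append(new_label)      # one OQ entry per query sent
    return out
```

For an OR process with k outgoing edges the same label is appended k
times, which is the repetition that the note under step a. requires.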


3.3.3  On Receiving a Grant



Section 2.6 specifies process behavior upon receiving grant
messages. In this section, the rules are extended to accommodate the
deadlock detection procedure. Suppose process v receives a grant
from process w. In addition to the actions given in Section 2.6, one
of the following steps is taken:


1.  {OR} The OQ-list is made an empty list.

2.  {AND} Entries of the form Q(T*<w>,v) are deleted from the
    OQ-list. That is, all record of queries sent to w is erased.


3.3.4  On Receiving a Reply


When a process receives a reply message, it is in response to some
query sent earlier. The reply will be propagated only if no grant
messages arrived since the query was sent, otherwise the reply is
invalid. At an AND-requesting process, any valid reply causes
immediate propagation. At an OR-requesting process, the reply
propagation is delayed until all valid replies have arrived.


Suppose process v receives R(C,s). Then process v searches its
OQ-list for an entry of the form Q(C,v). If the search fails, then
no action is taken by process v, and the reply is ignored. Otherwise
process v takes one of the following steps, which are collation
actions:

1.  {OR} The entry Q(C,v) is deleted from the OQ-list. After this
    deletion, process v again searches its OQ-list for an entry of
    the form Q(C,v):

    •  If the search fails, then process v locates the entry Q(C,x)
       in process v's IQ-list. Process v deletes the entry Q(C,x)
       from the IQ-list, and sends R(C,v) to process x.

    •  If the search turns up some Q(C,v) in the OQ-list, then
       process v takes no further action.

2.  {AND} The entry Q(C,v) is deleted from the OQ-list. Now process
    v searches its IQ-list for an entry Q(B,x) such that C = B*<s>:

    •  If the search fails, then process v takes no further action,
       i.e., R(C,s) is ignored.

    •  If the search found Q(B,x), then process v deletes Q(B,x)
       from the IQ-list, and sends R(B,v) to process x.
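
The collation rule can be sketched in the same style. As before, the
Python encoding is an assumption made for illustration: labels are
tuples of process-IDs, IQ-list entries are (label, sender) pairs,
OQ-list entries are labels, and the returned messages are
(kind, label, sender, receiver) tuples.

```python
def on_reply(me, mode, iq_list, oq_list, label, sender):
    """Handle R(label, sender) at process me; return the messages to send."""
    if label not in oq_list:
        return []                       # no matching query: reply is ignored
    oq_list.remove(label)               # delete one entry Q(label, me)
    if mode == "OR":
        if label in oq_list:
            return []                   # other replies are still outstanding
        wanted = label                  # the IQ entry carries the same label
    else:
        if label[-1:] != (sender,):
            return []                   # label is not of the form B*<sender>
        wanted = label[:-1]             # C = B*<s>, so B is C without <s>
    for entry in iq_list:
        if entry[0] == wanted:
            iq_list.remove(entry)
            return [("R", wanted, me, entry[1])]
    return []                           # AND case: an earlier reply was sent
```

An OR process thus answers only after the last of its replies, while
an AND process answers on the first valid reply and ignores the rest.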


3.4  Example of Procedure Execution


Figure 4 shows the PWFG used for this example. Process i is the
initiator of the deadlock detection procedure. Process x has the
only AND-request in the PWFG.

Figure 5 illustrates a trace of one possible execution of the
procedure. Each element in the table shows an event for a process.
All of the actions described within an event are considered to
happen concurrently. The notation "Q(B,x)→y" means that query Q(B,x)
is sent by process x to process y. "Q(B,x)•y" means that query
Q(B,x) arrives at y. The sequence does not show the supporting
updates to the IQ and OQ-lists by the processes.



Event  Process  Action      Arrives         Sent
-----  -------  ----------  --------------  --------------------------
1      v        Extension   Q(<i>,i)•v      Q(<i>,v)→x, Q(<i>,v)→w
2      w        Extension   Q(<i>,v)•w      Q(<i>,w)→v
       x        Extension   Q(<i>,v)•x      Q(<iy>,x)→y, Q(<iz>,x)→z
3      v        Reflection  Q(<i>,w)•v      R(<i>,v)→w
       y        Extension   Q(<iy>,x)•y     Q(<iy>,y)→s
       z        Extension   Q(<iz>,x)•z     Q(<iz>,z)→s, Q(<iz>,z)→v
4      v        Reflection  Q(<iz>,z)•v     R(<iz>,v)→z
       w        Collation   R(<i>,v)•w      R(<i>,w)→v
       s        Extension   Q(<iz>,z)•s     Q(<iz>,s)→w
5      v        Collation   R(<i>,w)•v
       w        Extension   Q(<iz>,s)•w     Q(<iz>,w)→v
       z        Collation   R(<iz>,v)•z
       s        Extension   Q(<iy>,y)•s     Q(<iy>,s)→w
6      v        Reflection  Q(<iz>,w)•v     R(<iz>,v)→w
       w        Extension   Q(<iy>,s)•w     Q(<iy>,w)→v
7      v        Reflection  Q(<iy>,w)•v     R(<iy>,v)→w
       w        Collation   R(<iz>,v)•w     R(<iz>,w)→s
8      w        Collation   R(<iy>,v)•w     R(<iy>,w)→s
       s        Collation   R(<iz>,w)•s     R(<iz>,s)→z
9      z        Collation   R(<iz>,s)•z     R(<iz>,z)→x
       s        Collation   R(<iy>,w)•s     R(<iy>,s)→y
10     x        Collation   R(<iz>,z)•x     R(<i>,x)→v
       y        Collation   R(<iy>,s)•y     R(<iy>,y)→x
11     v        Collation   R(<i>,x)•v      R(<i>,v)→i  (Deadlock)
       x        Ignored     R(<iy>,y)•x

Figure 5.  Example Execution Sequence.

Notes on the sequence: In event 3, two messages are sent to process
s: Q(<iy>,y)→s and Q(<iz>,z)→s. Because of the process model
restriction on message arrival, process s receives only one message
at a time. Accordingly, Q(<iy>,y)•s occurs in event 5, and
Q(<iz>,z)•s in event 4. Both messages cause extension at s because
the queries have different labels.

After event 4, process w has experienced extension and collation (on
behalf of process v's query) and w's IQ and OQ-lists are again
empty. Then in event 5, process w again is extending, this time on
behalf of Q(<iz>,s)•w.

Different cases of collation appear in this sequence. In event 4,
collation at w is a case of one reply triggering another reply.
Event 5, collation at z, is a case of a reply causing update to an
OQ-list, but no further message action. Event 11, "ignored" at x, is
explained in Section 3.3.4.

Here is a list of the contents of the IQ and OQ-lists after event 5
and before event 6. The "*" indicates a message that is also en
route between events.

Process  IQ-list                 OQ-list
-------  ----------------------  -----------------------
v        Q(<i>,i)                Q(<i>,v)
w        Q(<iz>,s)               Q(<iz>,w)*
x        Q(<i>,v)                Q(<iy>,x), Q(<iz>,x)
y        Q(<iy>,x)               Q(<iy>,y)
z        Q(<iz>,x)               Q(<iz>,z)
s        Q(<iz>,z), Q(<iy>,y)    Q(<iz>,s), Q(<iy>,s)*

In event 6, process v will reflect Q(<iz>,w)•v because Q(<i>,i) is
in v's IQ-list, and <i> ≤ <iz>. Also in event 6, process w will
extend because Q(<iy>,s)•w, and it is not the case that
<iz> ≤ <iy>. Therefore Q(<iy>,s) will be added to process w's
IQ-list and Q(<iy>,w)→v.
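
The whole example can be replayed with a small simulation. The Python
sketch below is illustrative only and rests on several assumptions:
the edge set of the PWFG is inferred here from the messages of the
trace in Figure 5 (v waits on x or w; w on v; x on y and z; y on s;
z on s or v; s on w), the graph is a static snapshot in which no
grants occur, a single global FIFO queue stands in for the FIFO
channels, and all function and variable names are hypothetical.

```python
from collections import deque

def is_prefix_or_equal(a, c):
    return c[:len(a)] == a

def detect(graph, suspect, initiator="i"):
    """Run the procedure on a static PWFG; True if deadlock is declared.

    graph maps a process-ID to (mode, successors), mode being "AND" or "OR".
    """
    iq = {p: [] for p in graph}        # IQ-lists: (label, sender) entries
    oq = {p: [] for p in graph}        # OQ-lists: labels of queries sent
    start = (initiator,)
    queue = deque([("Q", start, initiator, suspect)])
    while queue:
        kind, label, sender, v = queue.popleft()
        if v == initiator:
            return True                # R(<i>, suspect) reached the initiator
        mode, successors = graph[v]
        if not successors:
            continue                   # active process: message is ignored
        if kind == "Q":
            if any(is_prefix_or_equal(t, label) for t, _ in iq[v]):
                queue.append(("R", label, v, sender))        # reflection
                continue
            iq[v].append((label, sender))                     # extension
            for x in successors:
                new = label if mode == "OR" else label + (x,)
                oq[v].append(new)
                queue.append(("Q", new, v, x))
        elif label in oq[v]:                                  # collation
            oq[v].remove(label)
            if mode == "OR" and label in oq[v]:
                continue               # OR: wait for the remaining replies
            wanted = label if mode == "OR" else label[:-1]
            for entry in iq[v]:
                if entry[0] == wanted:
                    iq[v].remove(entry)
                    queue.append(("R", wanted, v, entry[1]))
                    break
    return False                       # activity subsided without detection

# Edge set inferred from the trace (an assumption, see the text above).
example = {
    "v": ("OR", ["x", "w"]), "w": ("OR", ["v"]), "x": ("AND", ["y", "z"]),
    "y": ("OR", ["s"]), "z": ("OR", ["s", "v"]), "s": ("OR", ["w"]),
}
# The same graph with process s made active (no outgoing edges).
example_with_active_s = dict(example, s=("OR", []))
```

The order of events differs from Figure 5, which shows only one
possible execution, but the outcome is the same: the initiator
receives its reply for the first graph, and no reply returns once s
is active.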

4.0  IMPLEMENTATION ISSUES


The  following  sections  show  how  the  basic  deadlock 
detection  procedure  can  be  enhanced.  Section  4.1 
suggests  how  initiation  of  the  procedure  should  be 
controlled.  In  section  4.2  some  concurrency 
restrictions  are  given.  Sections  4.3  and  4.4 
offer  efficiency  improvements. 

4.1  Initiation 


Suppose that a process has been blocked continuously for some time
T, where T is a performance parameter. That process is therefore
suspected of
being  deadlocked,  and  the  controller  initiates  the 
detection  procedure  for  that  process.  There  may 
be  many  processes  that  qualify  as  suspects,  and 
the  controller  could  initiate  the  detection  scheme 
for  each  of  the  suspects.  However,  within  each 
"deadlock  component"  of  the  PWFG,  it  is  sufficient 
for  a  controller  to  find  one  of  the  deadlocked 
processes,  and  it  is  desirable  to  limit  the  number 
of  initiations  to  reduce  message  traffic.  The 
controller  should  use  a  performance  parameter  K  to 
limit  the  rate  of  initiations  of  the  detection 
procedure.  If  this  rate  is  too  large,  then  much 
of  the  message  traffic  will  be  redundant,  because 
the  PWFG  will  not  be  changing  faster  than  the 
detection  procedure  executes. 
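
One possible reading of this policy is sketched below. It is not taken
from the paper: the class name, the `window` parameter, the
interpretation of K as "at most K initiations per window", and the
choice to restart a suspect's timer after initiation are all
assumptions made for illustration.

```python
from collections import deque


class InitiationController:
    """Hypothetical controller-side policy for starting detection.

    T: time a process must be continuously blocked to become a suspect.
    K: assumed to mean "at most K initiations per `window` time units".
    """

    def __init__(self, T, K, window=1.0):
        self.T = T
        self.K = K
        self.window = window
        self.blocked_since = {}   # process id -> start of blocked interval
        self.recent = deque()     # timestamps of recent initiations

    def block(self, pid, now):
        # Record only the start of a continuous blocked interval.
        self.blocked_since.setdefault(pid, now)

    def unblock(self, pid):
        self.blocked_since.pop(pid, None)

    def suspects(self, now):
        # Longest-blocked suspects first, ties broken by process id.
        found = [p for p, t in self.blocked_since.items() if now - t >= self.T]
        return sorted(found, key=lambda p: (self.blocked_since[p], p))

    def initiate(self, now):
        """Return the suspects for which detection is started at `now`."""
        # Forget initiations that have left the rate window.
        while self.recent and now - self.recent[0] >= self.window:
            self.recent.popleft()
        started = []
        for pid in self.suspects(now):
            if len(self.recent) >= self.K:
                break             # rate limit reached; defer the rest
            self.recent.append(now)
            started.append(pid)
            # Assumption: restart the timer so that the same process is
            # not selected again until another T has elapsed.
            self.blocked_since[pid] = now
        return started
```

With T = 5, K = 2 and a window of 10, three processes blocked at time 0
yield two initiations at time 5; the third suspect is deferred until
the window has passed.
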

There is no conflict when several initiators concurrently
attempt to detect deadlock: processes serve any number of
deadlock detection computations by maintaining the IQ-lists
and OQ-lists. If an execution of the procedure does not detect
deadlock, one result is that, for some processes, the IQ-list
and OQ-list will retain useless entries. The following scheme
will clean up the IQ-lists and OQ-lists. Suppose some
initiator is created to detect deadlock for suspected process
w. Let the new initiator be named w_k, which means initiator
for process w, version k. The previous (if any) initiator for
process w was named w_(k-1), and successive initiators will
have increasing version numbers. Then each time a process
receives a query, the obsolete IQ-list and OQ-list entries can
be expunged, for they all have labels of the form <w_j...>,
j < k.
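
The clean-up rule can be sketched as follows. The representation is an
assumption made for illustration: a label is a tuple whose first
element is a (process, version) pair naming the initiator, and list
entries are (label, sender) pairs. The function name is hypothetical.

```python
def expunge_obsolete(entries, incoming_label):
    """Drop entries left behind by older initiators of the same process.

    entries: list of (label, sender) pairs from an IQ-list or OQ-list.
    incoming_label: label of the query just received; its first element
    is the (process, version) pair of its initiator.
    """
    proc, version = incoming_label[0]
    kept = []
    for label, sender in entries:
        root_proc, root_version = label[0]
        # An entry is obsolete when it belongs to initiator w_j, j < k.
        if root_proc == proc and root_version < version:
            continue
        kept.append((label, sender))
    return kept


iq = [
    ((("w", 1), "y"), "x"),   # left over from initiator w_1
    ((("w", 2),), "w"),       # current initiator w_2
    ((("v", 1),), "v"),       # a different suspected process
]
print(expunge_obsolete(iq, (("w", 2), "z")))
```

Entries belonging to other suspected processes are untouched, so
concurrent detections for different processes do not interfere.
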

4.2  Concurrency  of  Process  Execution 

Concurrency of processing at different sites is expected for
a distributed program. The procedure in this paper also
permits concurrent execution within a site, subject to the
following restriction: each process must receive a message and
execute an action (reflection, collation, extension)
sequentially. This restriction maintains the integrity of the
IQ-lists and OQ-lists, and ensures that a process receives
only one message at a time. Interleaving the actions of
different processes, and parallel execution of different
processes, are allowed.
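
One way to enforce this restriction inside a site is a lock per
process, so that receiving a message and executing its action form a
single step for that process while different processes still run in
parallel. The sketch below is an assumption about how a controller
might do this; the `Process` class and `deliver` method are
hypothetical.

```python
import threading


class Process:
    """Hypothetical process record; one lock serializes its actions."""

    def __init__(self, name):
        self.name = name
        self.iq = []                     # IQ-list: queries received
        self.oq = []                     # OQ-list: queries sent
        self._lock = threading.Lock()

    def deliver(self, message, action):
        # Receive one message and run its action (reflection, collation,
        # or extension) as one uninterrupted step for this process.
        with self._lock:
            return action(self, message)


def record(proc, message):
    # Stand-in action: just note the message in the IQ-list.
    proc.iq.append(message)
    return len(proc.iq)


p, q = Process("p"), Process("q")
threads = [threading.Thread(target=target.deliver, args=(n, record))
           for n in range(50) for target in (p, q)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(p.iq), len(q.iq))
```
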

4.3  Sharing  Deadlock  Status 

The procedure may be improved by allowing processes to
propagate deadlock status. Once a controller makes the
determination that a process is in deadlock, all queries sent
to that process may be immediately reflected (all queries
would eventually be reflected anyway). This modification
improves the performance of the detection procedure under
concurrent initiations of the procedure by different
processes.
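
A minimal sketch of this shortcut, assuming a per-process state record
with a `deadlocked` flag set by the controller. The function name, the
dictionary layout, and the tuple used to represent a reply are all
hypothetical; `normal_handler` stands for the unmodified procedure.

```python
def handle_query(process_state, query, normal_handler):
    """Reflect at once when the receiver is already known to be deadlocked.

    process_state: dict with keys "name" and "deadlocked" (assumed layout).
    query: (label, sender) pair.
    normal_handler: callable implementing the unmodified procedure.
    """
    label, sender = query
    if process_state.get("deadlocked", False):
        # Reply R(label, receiver) goes straight back to the sender.
        return ("reply", label, process_state["name"], sender)
    return normal_handler(process_state, query)


state = {"name": "v", "deadlocked": True}
print(handle_query(state, (("i", "z"), "w"), lambda s, q: "normal"))
```
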

4.4  Reducing  Message  Traffic 

Processes communicate without regard to site location. The
execution of the detection procedure amounts to controllers
sending and receiving many small messages. In order to reduce
the message traffic, controllers may elect to "batch"
communication, i.e., retain process messages and accumulate
packets. This consideration may be automatically handled as
part of the communication protocol, as long as the order of
messages is logically preserved for inter-process
communication (cf. Section 2.2).
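
Batching can be sketched as a per-destination outbox. This is an
illustration under stated assumptions, not part of the paper: the
`Batcher` class and the `max_batch` threshold are hypothetical, and
message order is preserved per destination site, which in turn
preserves it for every pair of communicating processes.

```python
from collections import defaultdict


class Batcher:
    """Hypothetical per-site outbox that groups small messages into packets."""

    def __init__(self, max_batch=8):
        self.max_batch = max_batch
        self.pending = defaultdict(list)   # destination site -> queued messages

    def send(self, site, message):
        """Queue a message; return a full packet when ready, else None."""
        queue = self.pending[site]
        queue.append(message)              # appending keeps send order per site
        if len(queue) >= self.max_batch:
            return self.flush(site)
        return None

    def flush(self, site):
        """Emit whatever is queued for `site` as one packet, in send order."""
        packet = self.pending.pop(site, [])
        return (site, packet)


b = Batcher(max_batch=3)
b.send("S2", "Q1")
b.send("S2", "R1")
print(b.send("S2", "Q3"))   # ('S2', ['Q1', 'R1', 'Q3'])
```

A real controller would also flush on a timer so that a lone message
is not retained indefinitely; that detail is omitted here.
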

Controllers should give priority to message traffic within a
site. These are internal operations for a controller. By
preferring the internal message traffic, the controller will
detect local deadlocks without outside communication.

5.0 SUMMARY


We have presented a scheme to detect deadlock in distributed
systems. A sophisticated model of resource requests was used.
The structure of the procedure is distributed, and the
granularity of the distribution is of the same order as the
resource requests. The scheme is dynamic, and does not require
that graph structures be maintained. The procedure specifies
that controllers concurrently execute the detection procedure,
and that concurrency within a controller is permitted. The
procedure is not susceptible to "false deadlock" detection,
and we prove that the procedure is correct.

6.0 APPENDIX: PROOF OF DEADLOCK DETECTION PROCEDURE


The goal of the proof is defined in Section 6.1, and a local
definition of deadlock is specified. Some auxiliary
definitions are found in Section 6.2. As well as supporting
the proof, these definitions are useful intuitive bases for
understanding the procedure. The subject of Section 6.3 is the
termination of a tree computation. Section 6.4 assigns a
semantic meaning to reply messages and proves that this
meaning is upheld by the detection procedure. In Section 6.5
the proof is completed by verifying correctness conditions.

6.1 Criteria for Correctness

The deadlock detection procedure is not an algorithm. That
is, the deadlock situation will be detected, if it exists, but
the procedure will not detect "no deadlock" situations.
Correctness means that the deadlock detection procedure must
satisfy the following conditions:

P0   If process w is deadlocked prior to v's initiation of
     the detection procedure, then process v will detect
     deadlock for process w after a finite time.

P1   If process v detects deadlock, then a deadlock situation
     truly exists.

P1 is a partial correctness condition, and P0 is a
termination condition.

For the purpose of this proof, the following local definition
of deadlock is employed: Process v is deadlocked iff it is
permanently blocked in the PWFG. More precisely:

Definition: Process v is deadlocked iff either

•  Process v has an AND-request, has sent a request for each
   outgoing edge, but for at least one of these edges, no
   grant will ever be received by process v; or

•  Process v has an OR-request, has sent a request for each
   outgoing edge, but no grant will ever be received by
   process v.

We call this a local definition because it does not contain
the definition of the underlying static graph structure in the
PWFG that causes deadlock. The local definition also includes
the possibility of a process permanently blocked due to
starvation, or because another process is in an infinite loop.
We exclude these possibilities from consideration: they are
beyond the scope of deadlock detection.

6.2 Auxiliary Definitions

6.2.1  Tree  Computation 

A tree computation is a distributed computation. The tree
computation grows by sending queries and shrinks by receiving
replies. When a tree computation shrinks back to its root, it
terminates. The deadlock detection procedure uses tree
computations to search for active processes in the PWFG. The
paradigm for this type of distributed computation is due to
Dijkstra and Scholten [8].

In the deadlock detection procedure, the tree computation has
an additional aspect. It can be viewed as a distributed data
structure. The tree computation consists of the set of all
IQ-list and OQ-list elements having a common label.

Each tree computation is uniquely associated with a label. A
query or reply's label identifies a tree computation. When a
label is first generated by a process, a tree computation is
created. Creation of a tree computation happens in two ways:
when a process initiates the detection procedure, and when
extension occurs at an AND-requesting process. The latter
event causes new, offspring trees to be generated. The next
section formalizes this relationship.

6.2.2  Label  Descendancy 

A descendancy relation is defined for labels. Label B is
descended from, or equal to, label C iff C is a prefix of B.
Instead of the prefix operator, the notation B ⊆ C will be
used to mean that B is descended from or equal to C. Note that
all labels for an execution of the detection procedure have a
common ancestor, the root label, which identifies the
initiator of the detection procedure. Given labels B and C,
any process can determine whether or not B ⊆ C.
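
Because descendancy is defined purely by the prefix relation, the test
is local and cheap. A minimal sketch, assuming labels are represented
as tuples of identifiers (a representation chosen here for
illustration; the function name is hypothetical):

```python
def is_descendant_or_equal(b, c):
    """True iff label b is descended from, or equal to, label c.

    By the definition above this holds exactly when c is a prefix of b.
    """
    return len(c) <= len(b) and tuple(b[:len(c)]) == tuple(c)


root = ("i",)
print(is_descendant_or_equal(("i", "z"), root))        # True
print(is_descendant_or_equal(("i", "y"), ("i", "z")))  # False
```

Sibling labels such as <iy> and <iz> are unrelated to each other, yet
both descend from the root label <i>.
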

Since labels identify tree computations, and queries and
replies have labels, we will also say that tree computations
are related by descendancy.

6.3  Termination  of  Tree  Computations 

Tree  computation  T  terminates  when  v,  the  process 
that  created  T,  obtains  a  reply  for  each  query 
that  v  sent  with  T's  label,  with  no  intervening 
grants.  That  is,  between  the  sending  of  a  query 
Q(T,v)  along  edge  (v,w),  and  the  receiving  of 
reply  R(T,w),  v  did  not  receive  a  grant  from  w. 
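
The termination condition can be checked mechanically against the
sequence of events seen by the creator of T. The sketch below assumes
a hypothetical event log of (kind, peer) pairs, in the order the
creator sent or received them, restricted to one label T.

```python
def tree_terminated(events):
    """Decide whether tree computation T has terminated at its creator.

    events: ordered (kind, peer) pairs, where kind is "query" (sent to
    peer), "reply" (received from peer), or "grant" (received from peer).
    """
    outstanding = set()       # peers queried, reply not yet received
    for kind, peer in events:
        if kind == "query":
            outstanding.add(peer)
        elif kind == "grant" and peer in outstanding:
            return False      # intervening grant between query and reply
        elif kind == "reply":
            outstanding.discard(peer)
    return not outstanding    # every query must have been answered


print(tree_terminated([("query", "w"), ("query", "x"),
                       ("reply", "x"), ("reply", "w")]))   # True
print(tree_terminated([("query", "w"), ("grant", "w"),
                       ("reply", "w")]))                   # False
```
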

6.3.1  Tree  Computation  Termination  Lemma 

Tree computation T terminates iff, for every x and y such
that Q(T,x) is sent to y, R(T,y) arrives at x with no
intervening grants.

Proof: First we show that non-termination implies
(intervening grants or missing replies). Suppose T will not
terminate. Then v_0, the creator of T, sent some queries
Q(T,v_0), and

1.  v_0 will not obtain all replies corresponding to the
    queries, or

2.  v_0 receives a grant on some edge (v_0,v_1) before
    R(T,v_1) is received, although all replies are received
    by v_0.

This is one half of the desired equivalence.

Second, we show that termination implies (no intervening
grants or missing replies). Now suppose T terminates. Then
v_0, the creator of T, sent some set of queries Q(T,v_0), and
obtained corresponding replies with no intervening grants, by
definition of T's termination. Consider some arbitrary process
v_1 that did not reflect v_0's query to v_1 (reflection
trivially satisfies the result). v_1 therefore collated to
produce a reply R(T,v_1) sent to v_0. If v_1 had an
AND-request, no additional queries on T's behalf were sent,
and the result is complete. Otherwise v_1 had an OR-request,
and replied to v_0 because it sent some queries Q(T,v_1) and
obtained corresponding replies. Moreover, if a grant was
received on edge (v_1,v_2) between the sending of Q(T,v_1) and
receiving R(T,v_2), then collation would not have occurred
(grants diminish the OQ-list and cause replies to be ignored).
An intervening grant is ruled out.

Now consider some arbitrary process v_2 that did not reflect
v_1's query to v_2. The argument may be repeated until some
process v_k is reached, and v_k obtained all of its replies by
reflection. Since the path v_0, v_1, v_2, ... v_k was
arbitrarily chosen, all queries did lead to replies for tree
computation T, with no intervening grants. ∎

6.4  The  Meaning  of  Reply 

The following assertion can be made at the point when and
where process v sends reply R(T,v) to process w in response to
Q(T,w), previously sent from w to v:

1.  w  will  never  receive  (or  have  received)  a 
grant  from  v  after  w  sent  Q(T,w)  to  v,  or 

2.  Tree  computation  T,  or  some  ancestor  of  T, 
will  not  terminate. 

As shorthand for this rather complicated assertion, the
notation {v\w\T} will be used. Using the notation of Section
3.4,

L0   R(T,v)→w implies {v\w\T}

is an invariant assertion for the deadlock detection
procedure. Before we prove this invariant in the next section,
some smaller results are given below.

L1   R(T,v)»w implies {v\w\T}

L1 follows from L0, because R(T,v)→w must be in response to
some Q(T,w) sent previously from w to v. If a grant was
received by w from v between Q(T,w)→v and R(T,v)»w, then T
will not terminate, by the tree computation termination lemma.
L0 asserts that no later grant will be forthcoming unless T
(or an ancestor) fails to terminate, so L1 is implied.

L2   R(T,v)»w and w has an AND-request implies {v\w\U},
     where T ⊆ U.

R(T,v)»w is in response to Q(T,w)→v. If, in between these two
events, w received a grant from v, then T will not terminate.
Otherwise T must terminate because w created T exclusively for
edge (w,v). By L1, {v\w\T} holds, but now that T has
terminated the assertion is {v\w\U}, where U is the parent of
T.

6.4.1 Verification of Invariant

Replies  are  generated  by  reflection  and  collation. 
Collation  will  first  be  considered. 

6.4.1.1 Collation

Suppose  process  v  is  about  to  send  process  w  a 
reply,  R(T,v).  We  wish  to  show  {v\w\T}.  There 
must  have  been  a  query,  Q(T,w),  previously  sent 
from  w  to  v.  There  are  two  cases  to  consider: 

1.  v has an AND-request. Since v is collating to produce a
    reply, v must have sent a query Q(S,v) to some process z,
    and subsequently obtained a reply R(S,z) from z, where
    S ⊆ T. There cannot have been any grant from z to v
    between the sending of Q(S,v) and receiving of R(S,z), or
    else the collation would not occur. By L2, R(S,z)»v
    implies {z\v\T}, because T is the immediate ancestor of S.

    In turn, v will not send a grant to w until v gets a grant
    from z (recall v has an AND-request). This would imply
    non-termination of T or some ancestor of T because of
    {z\v\T}. If v has already sent a grant to w then it will
    arrive before R(T,v) and T will not terminate due to the
    tree computation termination lemma. It is safe to assert
    {v\w\T}.

2.  v is an OR vertex. Since v is collating to produce a
    reply, v must have sent queries Q(T,v) to some processes
    z_1, z_2, ... z_n, and subsequently received replies
    R(T,z_1), R(T,z_2), ... R(T,z_n). Any grant from a process
    z_j cannot have been received at process v between the
    sending of Q(T,v) and receiving R(T,z_j), or collation
    would not occur (this also rules out cancellation of any
    (v,z_k)). v will not send a grant to w unless v receives a
    grant from some z_k. But {z_k\v\T} holds because R(T,z_k)
    has been received, with no intervening grants, so it is
    safe to assert {v\w\T}. ∎


6.4.1.2 Reflection

Suppose process v is about to send R(T,v) to process x by
reflection. We wish to show {v\x\T}. There must have been a
query Q(T,x) sent by process x. This is the situation (see
Figure 6):

Process w sent query Q(U,w) to process v. Process v did not
reflect Q(U,w), and has not yet sent a reply R(U,v) to w. Now
v has received query Q(T,x), which is to be reflected because
T ⊆ U.


[Figure 6. Situation for Reflection. Note: T ⊆ U and V ⊆ U.
First extension occurred at v, and now reflection is to take
place.]

If process x receives a grant from v before R(T,v), then T
will not terminate. In order to send a grant to x after v
sends R(T,v), v must become active. There are two cases:

1.  v becomes active before sending R(U,v) to w. In this case
    U will not terminate.

2.  v becomes active after sending R(U,v) to w. But this
    implies that v received a reply, R(V,z), from some process
    z before a grant from z was received by v. The assertion
    {z\v\V} and the grant from z now imply that U, or some
    ancestor of U, will not terminate.

In either case, it is safe to assert {v\x\T}. ∎

6.5 Verification of Correctness Conditions

6.5.1 Verification of P1

Suppose process i is the initiator and declares deadlock for
process v as the result of collation that produced R(T,v)→i.
Any reply R(S,z_k)»v that led to v's collation implies
{z_k\v\S}, where S ⊆ T. But S has terminated, T has
terminated, and T has no ancestors. z_k will therefore never
send a grant to v. The conclusion is that v is truly
deadlocked. ∎

6.5.2 Verification of P0

We  verify  PO  by  assuming  that  the  deadlock 
detection  procedure  will  fail  to  detect  deadlock 
and  proving  a  contradiction  by  construction. 

Suppose process v_0 is deadlocked and receives an initiator's
query, but deadlock will never be detected. Consider two
cases:

1.  If v_0 has an AND-request, numerous tree computations were
    created, but none of them will terminate because, by
    hypothesis, deadlock will never be detected. It is
    possible that a tree computation will not terminate
    because v_0 received a grant between the times of sending
    a query and getting a reply; this cannot be true of all of
    v_0's tree computations, since the deadlock assumption
    would be violated. v_0 therefore sent some query
    Q(T_0,v_0) to v_1, and will never receive a message from
    v_1.

2.  If v_0 has an OR-request, one tree computation was
    created. Under the deadlock assumption and the
    non-detection hypothesis, v_0 sent some query Q(T_0,v_0)
    to v_1 and will never receive a message from v_1.

In both cases, it is asserted that some v_1 will send neither
a grant nor R(T_0,v_1) to v_0.

Now consider v_1, which received Q(T_0,v_0) after a finite
time and will never send R(T_0,v_1) to v_0. v_1 cannot be
active, since activity might lead to v_1 sending v_0 a grant.
That is, with this deadlock detection procedure, starvation
will not be detected. There are now two cases for v_1:

1.  v_1 has an AND-request, and will never collate to produce
    a reply for v_0. v_1 therefore sent some query Q(T_1,v_1)
    to some v_2 and v_1 never received R(T_1,v_2). We may also
    assert that v_1 never gets a grant from v_2, by an
    appropriate choice of v_2. Contradiction of this assertion
    violates the hypothesis. T_1 ⊆ T_0.

2.  v_1 has an OR-request and will never collate to produce a
    reply for v_0. v_1 therefore sent some query Q(T_1,v_1) to
    v_2, and v_2 will never send R(T_1,v_2), nor will v_2 send
    v_1 a grant, by hypothesis. T_1 = T_0.

In both cases, it is asserted that some v_2 will send neither
a grant nor R(T_1,v_2) to v_1, where T_1 ⊆ T_0.

Now consider v_2, which received Q(T_1,v_1) after a finite
time and will never send R(T_1,v_2) to v_1. v_2 cannot be
active, since activity might lead to v_2 sending v_1 a grant.
Nor can v_2 be part of T_1 (or any ancestor of T_1), because
reflection would occur and v_1 would get a reply in finite
time. Therefore, v_2 will not reply to v_1 because collation
did not occur at v_2. The case argument for v_1 is repeated,
as above.

Eventually some v_k will be reached, and all of v_k's queries
must be reflected, simply because the number of processes is
finite. If v_k will never send a reply, then it cannot be
because of grants interfering with the reflection of v_k's
queries, for then deadlock would be contradicted by the choice
of v_1, v_2, ..., v_k, and hypothesis. But all of v_k's
queries will be reflected in finite time, so v_k will obtain
replies in finite time. Therefore v_k will collate to reply in
finite time, but this violates the assumption about v_k.
Contradiction. ∎